Age Dependent Document Priors in Link Structure Analysis

Authors

  • Claudia Hauff
  • Leif Azzopardi
Abstract

Much research has investigated how links between web pages can be exploited in an Information Retrieval setting [1, 4]. In this poster, we investigate the application of the Barabási-Albert model to link structure analysis on a collection of web documents within the language modeling framework. Our model utilizes the web structure as described by a scale-free network and derives a document prior based on a web document's age and linkage. Preliminary experiments indicate the utility of our approach over other current link structure algorithms and warrant further research.
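
The abstract does not give the exact form of the prior, so the following is only a minimal sketch of one way an age-dependent prior could be built, not the authors' method. It assumes the standard mean-field result for the Barabási-Albert model, in which a node added at time t_i has expected degree m * sqrt(t / t_i) at time t, and it scores each document by its observed in-links relative to that expectation. The function names, the add-one smoothing, and the toy data are all hypothetical.

```python
import math


def ba_expected_degree(m, t_now, t_birth):
    """Mean-field expected degree in the Barabasi-Albert model.

    A node that joined at time t_birth (> 0) with m initial links is
    expected to have degree m * sqrt(t_now / t_birth) at time t_now.
    """
    return m * math.sqrt(t_now / t_birth)


def age_adjusted_priors(docs, m, t_now):
    """Return a normalized prior P(d) for each document.

    docs maps doc_id -> (in_links, t_birth). The raw score is the
    observed in-link count divided by the count expected for a document
    of that age (an assumption of this sketch, not taken from the paper).
    """
    raw = {}
    for doc_id, (in_links, t_birth) in docs.items():
        expected = ba_expected_degree(m, t_now, t_birth)
        raw[doc_id] = (in_links + 1) / expected  # add-one smoothing (assumed)
    total = sum(raw.values())
    return {doc_id: score / total for doc_id, score in raw.items()}


def score(query_log_likelihood, prior):
    """Language-model ranking score: log P(q|d) + log P(d)."""
    return query_log_likelihood + math.log(prior)


# Toy collection (hypothetical): two documents with equal in-link counts,
# one created at time step 1 and one created at time step 100.
docs = {"old_page": (20, 1), "new_page": (20, 100)}
priors = age_adjusted_priors(docs, m=2, t_now=100)
```

In this toy setting the newer page receives the larger prior, because it has gathered the same number of in-links in far less time. The prior then enters the ranking as an additive log term alongside the query log-likelihood.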

Similar resources

Language Models for Searching in Web Corpora

We describe our participation in the TREC 2004 Web and Terabyte tracks. For the web track, we employ mixture language models based on document full-text, incoming anchor text, and document titles, with a range of web-centric priors. We provide a detailed analysis of the effect on relevance of document length, URL structure, and link topology. The resulting web-centric priors are applied to three...

Combining Structural Information and the Use of Priors in Mixed Named-Page and Homepage Finding

This paper presents Carnegie Mellon University's experiments on the mixed named-page and homepage finding task of the TREC 12 Web Track. Our results were strong; we achieved this success using language models estimated by combining information from document text, in-link text, and the structure of the documents. We also present experiments using expectations about poster...

Language Model Document Priors based on Citation and Co-citation Analysis

Citation, an integral component of research papers, implies a certain kind of relevance that is not well captured in current Information Retrieval (IR) research. In this paper, we explore incorporating the results of citation and co-citation analysis into the IR modeling process. We go beyond the general uniform document prior assumption in the language modeling framework through deriving doc...

Persian Printed Document Analysis and Page Segmentation

This paper presents a hybrid method, combining low-resolution and high-resolution analysis, for Persian page segmentation. In the low-resolution page segmentation, a pyramidal image structure is constructed for multiscale analysis and segments the document image into a set of regions. In the high-resolution page segmentation, by connected components analysis, each region is segmented into homogeneous regions and identifyi...

Investigating the Relationship between Population Structure and Poverty

Introduction: Poverty reduction is one of the important macroeconomic goals of any country, but achieving this goal requires examining the factors affecting it. Changes in the age structure of the population are one of the effective factors in reducing poverty in countries. Therefore, governments can make the most of their population, given the capacity of countries and providing the ne...

Publication date: 2005